Journal of Molecular Evolution — Latest Matching Preprints

1

Evolutionary Stratification of Codon Usage Bias In Plants Arises from GC3 Composition and Translational Optimization

Mohanta, T. K.

2026-07-01 genomics 10.64898/2026.06.26.734692 medRxiv

Top 0.1%

9.6%

Codon usage bias is a fundamental genomic characteristic that reflects the non-random, preferential use of synonymous codons. It is a major determinant of translational efficiency, gene regulation, and molecular evolution. However, the evolutionary bias and functional relevance of codon usage bias across the plant lineage are poorly defined: it remains unclear which factors chiefly determine relative synonymous codon usage (RSCU) in genomes and how codon usage bias influences gene regulation and the molecular evolution of genomes. A genome-wide codon usage bias study of the coding DNA sequences of 262 plant genomes was conducted, encompassing more than 4.6 billion codons from > 11 million coding sequences. Relative synonymous codon usage, codon adaptation index, codon-anticodon mapping, effective number of codons (ENC)-GC3, GC1,2-GC3, parity rule 2 (PR2-bias), molecular economy, and machine learning approaches were used for the study. Codon usage bias was strongly non-random and exhibited clear phylogenetic structuring. The higher plants favoured A/T-ending codons, whereas early-diverging lineages were enriched in G/C-ending codons. Analysis of RSCU, codon adaptation index, and codon-anticodon pairing indicated that translational selection is mediated by tRNA availability, which helps to sustain these molecular patterns. Machine-learning approaches identified a small subset of codons with an outsized influence on genome-wide codon usage landscapes. Further analysis revealed robust inverse relationships between the effective number of codons and GC content at synonymous third positions. Neutrality analysis revealed that approximately 61% of the variation was driven by mutational pressure, tempered by selective constraints. Phylogenetic reconstruction showed a progressive relaxation of codon bias from algae to angiosperms while maintaining a conserved molecular economy cost of ~30 ATP per codon across the lineages. The study revealed that codon usage bias is a lineage-specific, evolutionarily conserved trait governed by mutation, selection, and translational optimization.
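
For readers unfamiliar with the RSCU metric named in this abstract, the following is a minimal sketch of the textbook definition: the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. It assumes the standard genetic code and a single in-frame coding sequence; the function name `rscu` and the toy input are illustrative and are not taken from the preprint.

```python
from collections import Counter, defaultdict
from itertools import product

# Standard genetic code, built in TCAG order (an assumption: the preprint
# does not state which translation table was used).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def rscu(cds):
    """Return {codon: RSCU} for one in-frame coding sequence."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    codons = [cds[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(c for c in codons if c in CODON_TABLE)

    # Group sense codons into synonymous families, one per amino acid.
    families = defaultdict(list)
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families[aa].append(codon)

    values = {}
    for family in families.values():
        total = sum(counts[c] for c in family)
        if total == 0:
            continue  # amino acid absent from this sequence
        expected = total / len(family)  # count under equal usage
        for c in family:
            values[c] = counts[c] / expected
    return values


toy = rscu("GCTGCTGCTGCC")  # four alanine codons: 3 x GCT, 1 x GCC
print(toy["GCT"], toy["GCC"], toy["GCA"])
```

An RSCU above 1 marks a codon used more often than expected under equal usage, and a value below 1 marks an avoided codon.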

2

Evo 2's Perception of Single Nucleotide Substitutions in the Genes of Two Plant Model Organisms

Mantegazza, O.; Bertolini, L.; Leoni, G.; Colaiacovo, M.; Petrillo, M.; Bonfini, L.; Savini, C.; Ceresa, M.; Zaoui, X.

2026-07-03 genomics 10.64898/2026.07.01.729829 medRxiv

Top 0.1%

2.9%

Although DNA Large Language Models (DNA-LLMs) offer a path to decoding genetic complexity, our ability to evaluate these models is constrained by our incomplete understanding of the very same genetic syntax and functional logic that these models are trained to learn. In this study we use single nucleotide substitutions that have or have not been observed in living organisms, to evaluate how the DNA-LLM Evo 2 interprets gene sequences from two plant model organisms, Arabidopsis thaliana and Oryza sativa japonica. Using perplexity as a measure of the model's confidence, we observe that alleles containing simulated substitutions are perceived, on average, as less likely than those observed in vivo. Although the size of the effect is modest, the effect is statistically significant and robust, suggesting that Evo 2 is aligned with our current understanding of evolutionary selective constraints. This approach is designed to be model-agnostic and species-agnostic and could serve as a generic framework for evaluating the performance of DNA-LLMs.
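
As a reminder of how perplexity relates to model confidence, here is a minimal sketch that turns per-token log-probabilities into a perplexity score. It assumes the natural-log probabilities have already been obtained from some sequence model; no Evo 2 interface is shown, and the function name and the numbers in the usage lines are hypothetical.

```python
import math


def perplexity(token_log_probs):
    """Perplexity = exp of the negative mean natural-log probability per token."""
    if not token_log_probs:
        raise ValueError("need at least one token log-probability")
    return math.exp(-sum(token_log_probs) / len(token_log_probs))


# Hypothetical log-probabilities for two alleles of the same length.
observed_allele = [math.log(p) for p in (0.40, 0.35, 0.50, 0.45)]
simulated_allele = [math.log(p) for p in (0.40, 0.35, 0.20, 0.45)]

# A lower perplexity means the model finds the sequence more likely.
print(perplexity(observed_allele) < perplexity(simulated_allele))
```

A model that assigns every nucleotide a probability of 0.25 has a perplexity of 4, which is a convenient baseline when reading such scores.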

3

Location dependence of protein intrinsic disorder in Drosophila melanogaster

Abdulla Daanaa, H. S.; Kuraku, S.; Akashi, H.; Saito, K.

2026-07-03 bioinformatics 10.64898/2026.07.02.732782 medRxiv

Top 0.2%

1.9%

The relevance of protein structural flexibility in function remains contested, but experimental and computational evidence continues to accumulate. Many efforts to address this investigate intrinsic disorder, which commonly refers to peptide segments or entire protein sequences that presumably lack structure and exhibit high flexibility/conformational heterogeneity under physiological conditions. These efforts face challenges such as conflicting computational predictions and ambiguous relationships among intrinsic disorder locations and other protein properties. We address these challenges at a genome-wide scale in Drosophila melanogaster using residue-level predictions for various protein properties. We employ single and consensus approaches to quantify the prevalence of intrinsic disorder and attempt to infer function by testing for differences along protein sequences. Intrinsic disorder is likely more common at terminals than internal regions, and amino acid frequencies can vary substantially between regions in a manner that plausibly reflects functions of intrinsic disorder, rather than only proteome-wide effects. Tertiary structure potentially underlies the prevalence of intrinsic disorder along protein sequences; this prevalence varies more in a putatively solvent-exposed context than a solvent-buried one. Protein-binding appears to be a main function of intrinsic disorder, and we find support consistent with the notion that structural flexibility fosters binding plasticity, and show that location and protein length are factors in this relationship. Nucleic acid-binding and linker are ostensibly less common disorder functions than protein-binding, but nucleic acid-binding seems more localized at terminals. Residue-level estimates of selection pressure indicate that disordered regions generally evolve under weaker sequence constraints than structured regions, except at the N-terminal region. 
Biases in disorder prediction are a considerable factor in many of the observations, but unlikely a full explanation. The findings strengthen support for functional relevance of flexibility, offer insight into protein architecture and function, and lend impetus for experimental inquiry.

4

Gene model for the ortholog of tgo in Drosophila busckii

Perez, J.; Giunta, A. A.; Wittke-Thompson, J. K.

2026-07-01 genomics 10.64898/2026.06.26.734908 medRxiv

Top 0.2%

1.7%

Gene model for the ortholog of tango (tgo) in the Sep. 2015 (UC Berkeley ASM127793v1/DbusGB1) Genome Assembly (GenBank Accession: GCA_001277935.1) of Drosophila busckii. This ortholog was characterized as part of a developing dataset to study the evolution of the Insulin/insulin-like growth factor signaling pathway (IIS) across the genus Drosophila using the Genomics Education Partnership gene annotation protocol for Course-based Undergraduate Research Experiences.

5

Gene model for the ortholog of DENR in Drosophila eugracilis

Lawson, M. E.; Sanow, K. A.; Martinand, I.; Fratian, M.; Matura, M.; Rele, C. P.; Reed, L. K.; Thompson, J. S.; O'Rourke, K. S.

2026-06-26 genomics 10.64898/2026.06.23.734050 medRxiv

Top 0.2%

1.7%

Gene model for the ortholog of Density regulated protein (DENR) in the Apr. 2013 (BCM-HGSC/Deug_2.0) (DeugGB2) Genome Assembly (GenBank Accession: GCA_000236325.2) of D. eugracilis. This ortholog was characterized as part of a developing dataset to study the evolution of the Insulin/insulin-like growth factor signaling pathway (IIS) across the genus Drosophila using the Genomics Education Partnership gene annotation protocol for Course-based Undergraduate Research Experiences.

6

Comparison of directional random walk and weighted least squares modeling of sparse fossil data

Ergon, R.

2026-07-01 evolutionary biology 10.64898/2026.06.26.734751 medRxiv

Top 0.2%

1.7%

The general random walk model (GRW) of Hunt (2006) is used to infer directional evolution in mean trait values from sparse fossil data by modeling phenotypic change as the accumulated result of small steps with mean step sizes and step variances. Using simulations and real data cases, Ergon (2026) showed that the step variances can be estimated reasonably well only when the mean trait values have small measurement errors, while for fossil data with realistic measurement errors they appear to be extremely difficult to find, and they are often found to be negative. In the simulations Ergon (2026) assumed that the true phenotypic mean values were known. Here, I essentially repeat these simulations under the assumption that only mean trait values with large measurement errors are known, and based on weighted mean squared error (WMSE) comparisons the conclusion is that weighted least squares (WLS) is a better method than GRW. A second conclusion is that WLS is a better method also in the possibly rare cases with large measurement errors where the GRW parameters are estimated well. The GRW method is simply not flexible enough to handle such cases. A third conclusion is that Akaike Information Criterion (AIC) results for GRW models with large measurement errors relative to the step variance may be overly optimistic.
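
To make the weighted least squares idea concrete, the sketch below fits a straight-line trend to mean trait values, weighting each sample by the inverse of its squared standard error. The straight-line form is an assumption chosen because it is the expected trajectory of a random walk with a constant mean step; the preprint's actual WLS model may differ, and all names and values here are illustrative.

```python
def wls_line(times, means, std_errors):
    """Fit mean = intercept + slope * time with weights 1 / std_error**2."""
    weights = [1.0 / se ** 2 for se in std_errors]
    sw = sum(weights)
    st = sum(w * t for w, t in zip(weights, times))
    sy = sum(w * y for w, y in zip(weights, means))
    stt = sum(w * t * t for w, t in zip(weights, times))
    sty = sum(w * t * y for w, t, y in zip(weights, times, means))

    denom = sw * stt - st * st
    if denom == 0:
        raise ValueError("need at least two distinct sampling times")
    slope = (sw * sty - st * sy) / denom
    intercept = (sy - slope * st) / sw
    return intercept, slope


# Hypothetical fossil series: sample ages, trait means, and standard errors.
times = [0.0, 1.0, 2.0, 3.0, 4.0]
means = [2.1, 2.4, 3.2, 3.4, 4.1]
std_errors = [0.1, 0.3, 0.2, 0.5, 0.4]
intercept, slope = wls_line(times, means, std_errors)
print(intercept, slope)
```

Samples with large measurement errors receive small weights, so they pull the fitted trend less than precisely measured samples do.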

7

Gene model for the ortholog of raptor in Drosophila erecta

Backlund, A. E.; Nielsen, J.; Pulford, J.; Cook, B.; Anderson, J.; Robert, M.; Thompson, J. S.; Rele, C. P.; Wittke-Thompson, J. K.

2026-07-14 genomics 10.64898/2026.07.09.737526 medRxiv

Top 0.2%

1.5%

Gene model for the ortholog of raptor in the May 2011 (Agencourt Dere_CAF1/DereCAF1) Genome Assembly (GenBank Accession: GCA_000005135.1) of Drosophila erecta. This ortholog was characterized as part of a developing dataset to study the evolution of the Insulin/insulin-like growth factor signaling pathway (IIS) across the genus Drosophila using the Genomics Education Partnership gene annotation protocol for Course-based Undergraduate Research Experiences.

8

Generative continuous time model reveals epistatic signatures in protein evolution

Pagnani, A.; Barrat-Charlaix, P.

2026-07-10 bioinformatics 10.1101/2025.09.17.676821 medRxiv

Top 0.3%

1.3%

Protein evolution is fundamentally shaped by epistasis, where the effect of a mutation depends on the sequence context. As standard phylogenetic methods assume independently evolving sites, there is a need for more complex models based on accurate estimations of the fitness landscape. Good candidates are modern generative models -- such as the Potts model -- which successfully capture epistatic effects. However, recent work on generative evolutionary models usually uses discrete time, making these models difficult to integrate with the standard frameworks in evolutionary biology. We introduce a continuous-time sequence evolution model using the Gillespie algorithm and parameterized by a generative Potts model. This approach enables us to simulate realistic, family-specific evolutionary trajectories and allows for direct comparison with independent-site models. Surprisingly, we find that while epistasis significantly slows down evolution, it does not change the average evolutionary rates at individual sites. This is explained by the rate heterogeneity caused by context-dependence: we show that the rate at some positions varies from zero to high values depending on the context, while other positions are essentially independent of the context. Finally, we show that epistasis leads to a systematic underestimation bias in the inference of evolutionary distance between sequences. Overall, our work provides a new tool for simulating realistic protein evolution and offers novel insights into the complex interplay between epistasis and evolutionary dynamics.
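
The general shape of a Gillespie simulation with context-dependent substitution rates can be sketched as follows. Everything specific here is an assumption made for illustration: the four-letter alphabet, the toy nearest-neighbour energy standing in for a Potts model, and the rate form exp(-ΔE/2), which was chosen because it satisfies detailed balance with respect to exp(-E). The preprint's own parameterization may differ.

```python
import math
import random

ALPHABET = "ACDE"  # toy alphabet, not the 20 amino acids


def toy_energy(seq, coupling=1.0):
    """Hypothetical energy: identical neighbouring letters lower the energy."""
    return -coupling * sum(a == b for a, b in zip(seq, seq[1:]))


def substitution_events(seq):
    """List (rate, mutant) for every single-site substitution of seq."""
    e0 = toy_energy(seq)
    events = []
    for i, old in enumerate(seq):
        for new in ALPHABET:
            if new == old:
                continue
            mutant = seq[:i] + new + seq[i + 1:]
            # Assumed rate form; it depends on the whole sequence context.
            rate = math.exp(-(toy_energy(mutant) - e0) / 2)
            events.append((rate, mutant))
    return events


def gillespie(seq, t_max, rng):
    """Evolve seq in continuous time until t_max; return (sequence, n_substitutions)."""
    t, n_events = 0.0, 0
    while True:
        events = substitution_events(seq)
        total = sum(rate for rate, _ in events)
        t += rng.expovariate(total)  # exponential waiting time to next event
        if t > t_max:
            return seq, n_events
        rates = [rate for rate, _ in events]
        mutants = [mutant for _, mutant in events]
        seq = rng.choices(mutants, weights=rates, k=1)[0]  # pick event by rate
        n_events += 1


final_seq, n_subs = gillespie("ACDEACDE", t_max=1.0, rng=random.Random(0))
print(final_seq, n_subs)
```

Because the rates are recomputed after every substitution, the rate at a given site changes with its neighbours, which is the kind of context-dependence the abstract describes.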

9

Gene model for the ortholog of raptor in Drosophila grimshawi

Lieser, B. C.; Lose, B.; Kiser, C. A.; Butterfield, S.; Laschober, L.; Laskowski, L. F.; Nielsen, J.; Pulford, J.; Thompson, J. S.; Rele, C. P.; Wittke-Thompson, J. K.

2026-07-11 genomics 10.64898/2026.07.07.737051 medRxiv

Top 0.3%

1.1%

Gene model for the ortholog of raptor in the D. grimshawi May 2011 (Agencourt dgri_caf1/DgriCAF1) Genome Assembly (GenBank Accession: GCA_000005155.1) of Drosophila grimshawi. This ortholog was characterized as part of a developing dataset to study the evolution of the Insulin/insulin-like growth factor signaling pathway (IIS) across the genus Drosophila using the Genomics Education Partnership gene annotation protocol for Course-based Undergraduate Research Experiences.

10

Convergent evolutionary selection unravels the genetic basis of audition in moths

Cinel, S. D.; Flattmann, Q.; Earl, C.; Ellis, E.; Barber, J.; Sondhi, Y.; Mhatre, N. D.; Kawahara, A. Y.

2026-07-10 evolutionary biology 10.64898/2026.07.08.736348 medRxiv

Top 0.4%

0.8%

Hearing in Lepidoptera mediates a range of ecologically important behaviours, including mate communication, predator avoidance, and acoustic signalling. In moths, the evolution of predator-prey interactions with bats has further shaped hearing through a sensory arms race, with repeated co-option of auditory organs to detect and evade echolocating predators. Despite significant prior characterization of the neurophysiology and behaviour of hearing in moths, the genetic basis of hearing is poorly understood in most insects. In this study, we identify a core set of putative auditory genes in Lepidoptera using a combination of homology-based searches from Drosophila and evolutionary rate analyses. We find 56 genes present across all species and investigate whether gene copy number varies among non-hearing and hearing lineages and among 3 different ear types. We discovered seven genes associated with ear type and one with ear presence, but did not find significant losses in gene copy number in non-hearing species. We identified three genes (btv, Dnai2, and nompB) with strong evidence of selection in hearing clades and five genes with weaker evidence of selection. We discuss the potential roles of btv, nompB, and Dnai2 in ciliary transport and the aging of hair cells, as well as the possibility of actively amplified hearing. Our study serves as a primer and resource for further gene mining and functional testing of auditory genes in moths and other insects.

11

Model-free inference of evolution from allele frequency timeseries using permutation tests

Bertram, J.; Kushnir, A.

2026-07-03 evolutionary biology 10.64898/2026.07.01.735864 medRxiv

Top 0.5%

0.6%

Allele frequency (AF) timeseries allow us to directly observe the dynamics of evolution at a genetic level. However, extracting useful inferences from AF timeseries has proved difficult due to the model uncertainties and noisiness inherent in AF change at fine temporal scales. Here we present three new permutation tests, which assume neither a model of evolutionary change nor a parametric statistical model, to detect AF timeseries features of evolutionary interest. The features identified by these approaches are: 1) any evolutionary change (as opposed to apparent change due to measurement error); 2) directional selection; 3) fluctuating selection with a propensity to change sign (negative autocorrelation). We are not aware of existing tests for features 1 and 3. Feature 2 is commonly tested using standard evolutionary models such as the Wright-Fisher model; we show that the permutation approach has comparable statistical power. We apply our new approaches to AF timeseries data from D. melanogaster and D. pulex.
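
The logic of a permutation test on an allele frequency timeseries can be illustrated with a generic trend test. This is not one of the three tests from the preprint, whose statistics are not given in the abstract; it is a standard construction that assumes equally spaced sampling and that, under the null hypothesis, the observed frequencies are exchangeable across time points. All names and data are illustrative.

```python
import random


def trend_statistic(freqs):
    """Absolute covariance-like sum between time index and allele frequency."""
    n = len(freqs)
    t_mean = (n - 1) / 2
    f_mean = sum(freqs) / n
    return abs(sum((t - t_mean) * (f - f_mean) for t, f in enumerate(freqs)))


def permutation_trend_test(freqs, n_perm=2000, seed=0):
    """Two-sided permutation p-value for a monotone trend over time."""
    rng = random.Random(seed)
    observed = trend_statistic(freqs)
    shuffled = list(freqs)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break any association with time
        if trend_statistic(shuffled) >= observed - 1e-12:
            hits += 1
    # Add-one correction keeps the p-value strictly positive.
    return (hits + 1) / (n_perm + 1)


# Hypothetical allele frequencies at ten equally spaced time points.
rising = [0.10, 0.12, 0.15, 0.19, 0.22, 0.26, 0.31, 0.35, 0.40, 0.44]
print(permutation_trend_test(rising))
```

No evolutionary or parametric model enters the calculation: the null distribution comes entirely from reshuffling the observed data.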

12

Evolution of mutation rates in digital genomes: the roles of genetic drift, mutational supply, and genome size

Fernandez de Grado, Q.; Frenoy, A.

2026-07-03 evolutionary biology 10.64898/2026.07.03.736272 medRxiv

Top 0.6%

0.6%

Mutation is the ultimate mechanism that produces genetic novelty, and thus a central ingredient of evolution. Mutation rates are therefore thought to be tuned by natural selection, for example to optimize a delicate balance between the generation of adaptive diversity and the accumulation of deleterious mutations. As this selection occurs over very long time scales, models and simulations have been powerful tools to understand how mutation rate evolves and which factors influence it. Most simulation methods are nevertheless limited by the over-simplicity of the genotype-to-phenotype map they feature, especially regarding the encoding of mutation rate. We modified Aevol, an evolutionary simulator inspired by bacterial genomics with a realistic genome structure and a complex genotype-to-phenotype layer, to allow organisms to evolve genes coding for higher replication fidelity. This setup permits several degrees of realism absent in other models: mutation-rate modifier genes themselves experience a realistic distribution of effects of mutations and diminishing-returns epistasis, similarly to fitness modifiers. Moreover, a lower mutation rate comes with the trade-off of a larger genome to encode the genes improving replication fidelity. We use this setup to test hypotheses regarding the evolution of prokaryotic mutation rate, and its link with genome size and genetic drift. We found that evolution systematically increases replication fidelity, even when this results in lower fitness. We highlight two factors which limit the mutation rate decrease: genetic drift and the supply of gain-of-fidelity mutations.

13

Expression patterns and interaction profiles of heterotrimeric transducin subunits in the retina of the European robin (Erithacus rubecula)

Vujinovic, S.; Forst, J. J.; Kulkarni, S.; Güzelsoy-Flügge, U.; Langebrake, G.; Bunger, T.; Scholten, A.; Mouritsen, H.; Liedvogel, M.; Dedek, K.; Koch, K.-W.

2026-07-09 molecular biology 10.64898/2026.06.29.735184 medRxiv

Top 0.6%

0.5%

The heterotrimeric G-protein transducin (Gt) is among the key proteins mediating phototransduction in rod and cone cells of the vertebrate retina. Even though this protein has been extensively characterized in mammals, little is known about its expression patterns in migratory songbirds. Here we characterised Gt expression in the European robin, a night-migratory songbird known for its light-dependent magnetoreception. The mechanism underlying magnetoreception is not fully understood, but one well-supported hypothesis involves radical-pair formation in the blue light receptor cryptochrome type 4a. The - and γ-subunits of cone-specific transducin have been identified as possible interaction partners of cryptochrome 4a. Therefore, we analysed the expression patterns of various G-protein subunits in bird photoreceptors. Specifically, we combined single-cell RNA sequencing and immunohistochemistry, and tested for protein interaction by pulldown, co-immunoprecipitation, and NanoBiT luminescence assays. We show that genes for G-protein subunits GNB1 and GNB3 (coding for Gtβ1 and Gtβ3, respectively) are predominantly expressed in rods and cones. Among γ-subunits, GNGT2 (coding for GtγT2) was the principal isoform in cones, whereas GNG11 (coding for Gtγ11) was associated with rods. In contrast, we did not detect GNG10 (coding for Gtγ10) expression in either photoreceptor type. Interaction assays demonstrated that all three βγ combinations (βγT2, βγ10, and βγ11) can associate in vitro. These findings indicate that βγ dimer formation in vivo is likely constrained by the photoreceptor-specific expression of the respective subunits. Furthermore, the absence of GNG10 expression in rods and cones does not support a role of this γ-subunit in photoreceptor-based magnetoreception.

14

Novel Drosophila cis-regulatory elements can be uncovered by footprinting transcription factor binding sites in ATAC-seq data

Mei, C.; Ness, J.; Nakai, K.; Wunderlich, Z.

2026-06-25 genomics 10.64898/2026.06.22.733832 medRxiv

Top 0.6%

0.5%

Developmental processes depend on carefully coordinated gene expression. Expression is modulated by the binding of transcription factors (TFs) to cis-regulatory elements (CREs), like enhancers and promoters. Many computational and experimental approaches have been developed to find CREs, particularly enhancers, in the genome, each with strengths and caveats. Given the increasing availability of ATAC-seq data and methods to find TF binding therein, we hypothesized that we could use TF footprinting tools to find clusters of TF binding events within accessible chromatin that may act as CREs. Using the Drosophila anterior-posterior patterning network as a test bed, we used a digital genomic footprinting tool (DGT), TOBIAS, on previously published early embryo ATAC-seq data to characterize the TF footprint landscape of 16 TFs essential for embryonic patterning. Even in this system, with its extensive enhancer annotation, most footprinted TF binding sites lie outside of known enhancers, with intergenic and intronic regions hosting the highest TF footprint count, albeit at low density. To find potential novel enhancers, we identified high-density TF footprint clusters that are highly conserved and overlap with active enhancer histone mark signals. Five high-confidence candidates were selected for reporter assay validation and all five were found to drive spatially patterned expression in the embryo. This study shows that even in a highly characterized system, the analysis of footprinted TF binding sites in ATAC-seq data can uncover new regulatory regions and suggests this approach may be helpful in using existing ATAC-seq data to find novel CREs.

Article summary: Given the increasing availability of ATAC-seq datasets, workflows to exploit the data to uncover new cis-regulatory elements (CREs), including enhancers, are valuable. Using early anterior-posterior patterning in the Drosophila embryo as a test case, we find that previously published transcription factor footprinting tools and ATAC-seq data can be analyzed to yield new candidate CREs. Experimental validation confirms the activity of selected candidate CREs, suggesting that existing data can be analyzed to find novel regulatory elements.

15

Possible function of Hox2 in atrial siphon fusion of the ascidian Ciona

Liu, Y.; Yoshida, K.; Hozumi, A.; Itagaki, K.; Treen, N.; Sakuma, T.; Yamamoto, T.; Endo, T.; Sasakura, Y.

2026-07-14 developmental biology 10.64898/2026.07.13.738359 medRxiv

Top 0.7%

0.5%

The hallmark of sessile adult ascidians is a vase-like shape with a single oral and atrial siphon. Ciona, however, develops two atrial siphons after metamorphosis, which subsequently fuse into one. The mechanisms underlying this fusion are unknown. This study suggests that Hox2 controls this process. Hox2-knockout animals using Transcription-Activator-Like Effector Nuclease (TALEN) retain two atrial siphons throughout their lives. During normal fusion, epidermal cells between the siphons flatten along the anterior-posterior axis. This cellular flattening does not occur in Hox2-knockout animals, suggesting that the shape change in the epidermal cells produces tension, allowing the atrial siphon openings to converge at the midline for fusion. Hox2-knockout animals lack cupular organs, which are suspected hydrodynamic sensors in the internal epithelium of the fused atrial siphon and on the sperm duct. Among several knockout attempts, atrial siphon fusion was reproduced by only one TALEN pair, suggesting that this phenotype is driven by a mutation having a broader effect than those abolishing protein function. Many ascidians, unlike Ciona, develop a single atrial siphon shortly after metamorphosis. Our findings suggest that a phylogenetically conserved gene, Hox2, establishes this group-specific atrial siphon formation mechanism in Ciona.

16

The contribution of recent and historical demographic histories to genomic diversity and conservation status in plant species

Tao, T.; Li, P.; Zhu, Y.; Zhang, S.; Zhang, M.; Lascoux, M.; Chen, J.

2026-06-29 evolutionary biology 10.64898/2026.06.24.734111 medRxiv

Top 0.9%

0.4%

Demographic factors are intrinsically crucial to evaluate species' extinction risk. However, measuring them remains difficult and time-consuming and the use of genomic summary statistics has been advocated to assess the conservation status of a species. In the present study, we estimated (i) the census number (Nc), (ii) effective population size (Ne) over three different time periods, recent, historical and ancient, (iii) neutral genetic diversity (π4), and (iv) a measure of the efficacy of purifying selection (π0/π4) for 101 plant species using population genomic sequencing data. Twenty-one species are from the Plant Species with Extremely Small Populations (PSESP) program of SW China. Threatened species exhibited significantly lower Ne, Nc, π4, and weaker purifying selection, but had a higher Ne/Nc ratio than non-threatened ones. Nc was the main determinant in identifying conservation status, and contemporary neutral genetic diversity was predominantly influenced by historical Ne. In the absence of demographic information, genetic parameters are a good proxy of conservation status, likely because currently threatened species also had a low historical population size. In summary, our findings suggest that direct estimates of Nc are more useful than π4, although the latter remains a valuable conservation indicator. Hence, efforts such as the PSESP should be extended.

17

Phylogenetic correlation between the Type IV Secretion System and HIP1 suggest an adaptation for horizontal gene transfer conserved at the phylum level

Rodriguez-Cruz, U.; Moreno-Hagelsieb, G.; Abreu-Goodger, C.; Martinez-Guerrero, C.; Delaye, L.

2026-06-30 evolutionary biology 10.64898/2026.06.24.733823 medRxiv

Top 0.9%

0.4%

Most cyanobacterial genomes are rich in the GCGATCGC octamer, also known as Highly Iterated Palindrome 1 (HIP1). Despite its description over three decades ago, the biological function of this highly abundant sequence is only beginning to be elucidated. HIP1 is recognized by two DNA methylases, DmtA and DmtC, and is characterized by its evolutionary conservation and a quasi-periodic distribution within genomes. However, whether the phylogenetic distribution of HIP1 correlates with the presence of functional categories of protein families remains unknown. Here we investigated whether certain protein families share a phylogenetic distribution with this abundant palindromic sequence across cyanobacterial genomes. Our analysis revealed a strong phylogenetic correlation between several proteins of the Type IV secretion system (T4SS) and the abundance of HIP1. This finding aligns with recent discoveries demonstrating that HIP1 enhances DNA transformation in a methylation-dependent manner in two distinct cyanobacterial species. Consequently, we hypothesize that HIP1 functions as a conserved adaptation for horizontal gene transfer (HGT) at the phylum level, potentially by serving as a DNA-uptake recognition sequence in cyanobacteria.

Significance statement: Scientists have long been baffled by the HIP1 sequence, a short, highly common, repetitive DNA pattern found across almost all cyanobacterial genomes. Our study used a whole-genome evolutionary approach and found that the presence of this repetitive pattern is tightly linked to the presence of a cell's external DNA uptake system. This tight co-evolutionary relationship suggests that HIP1 isn't just a random genomic feature, but a conserved evolutionary adaptation used by the entire cyanobacterial phylum to specifically enhance their ability to acquire new genes from one another.

18

Differential selection between sexes and the evolution of recombination in haplodiploids

Patel, V.; Roze, D.

2026-07-03 evolutionary biology 10.64898/2026.06.29.735359 medRxiv

Top 0.9%

0.3%

Eusocial Hymenoptera present the highest known recombination rates among metazoans, which evolved several times independently among bees, ants and wasps. Several hypotheses have been proposed to explain this observation, including stronger selection for recombination caused by coevolving parasites and pathogens, and strong sexual selection among haploid males due to male-biased sex ratios among reproductive individuals. In this article, we explore the effects of haplodiploidy and differential selection between sexes on the evolution of recombination, by analyzing a three-locus model in which selection for recombination stems from negative epistasis between selected loci. Our analytical predictions are compared with the results of individual-based simulations in which deleterious mutations occur along a linear chromosome. Our results show that, at mutation-selection balance for deleterious alleles, increasing the strength of selection against deleterious alleles (due to the effect of male haploidy and/or sexual selection) tends to reduce selection for recombination. However, an increase in the overall magnitude of negative epistasis (which may also be due to male haploidy and/or sexual selection) combined with the fact that recombination only occurs in females may increase selection for recombination substantially. Our model also shows that, in conditions favoring recombination, increasing recombination in meioses leading to parthenogenetic ovules (and male offspring) may yield stronger benefits than in meioses leading to fertilized ovules (and female offspring).

19

Mapping pathogenic patterns in membrane transporters from the GLUT transporter family

Kadasova, N.; Martinat, D.; Spackova, A.; Hutarova Varekova, I.; Berka, K.

2026-07-02 bioinformatics 10.64898/2026.06.28.735151 medRxiv

Top 1%

0.3%

Significance: Missense mutations can lead to pathological effects in human cells. Predictive methods that account for structural context, such as AlphaMissense, can provide pathogenicity scores. The accumulation of pathogenicity hotspots can reveal important structural features within individual proteins of protein families, such as GLUT transporters. Mapping pathogenicity scores onto the structure can thus provide a mechanistic explanation of the protein function necessary for its role in the cell.

Abstract: Non-synonymous amino acid substitutions (missense mutations) are common in the general population; some are causative of serious disease. Depending on their structural context, they can disrupt protein function, folding, or dynamics. Computational predictive methods developed in recent years, such as AlphaMissense, provide new insights into how missense mutations affect protein structure by predicting and mapping their pathogenicity across each amino acid in the human proteome. In this study, we identify recurring patterns of pathogenicity prediction across the GLUT family membrane transporters encoded by genes slc2a1-14. Within the GLUT transporter family, we observe higher pathogenicity profiles in the transmembrane domains, particularly in pore-lining and binding-site residues. Predicted missense pathogenicity is elevated throughout residues assigned to the central cavity, suggesting sensitivity of the transport pathway. Another finding shows higher pathogenicity in specific transmembrane helices of the protein, with the same pattern across all proteins. On the other hand, we observed lower pathogenicity values in some representatives of the GLUT family. These findings show that the pathogenicity of glucose transport within the GLUT family may be shaped by functional redundancy and physiological essentiality across GLUT groups.

20

The effect of genome organisation on selection efficiency in two contrasted plant species

James, J.; Lascoux, M.

2026-07-15 evolutionary biology 10.64898/2025.12.19.695387 medRxiv

Top 1%

0.3%

Does the distribution of fitness effects (DFE) of new mutations vary across the genome? Under the classical Fisher Geometric Model (FGM) we might not expect it to. In FGM, phenotypic traits are envisioned as dimensions of a landscape, with fitness determined by position in the landscape, i.e., the particular combination of traits of an individual. New mutations are represented by vectors that move from an ancestral to a new phenotype. In classical FGM these vectors affect all trait dimensions simultaneously (universal pleiotropy). However, introducing partial and modular pleiotropy into an FGM framework leads to an expectation that parameters of the DFE will vary with mutational pleiotropy, that is, the number of traits affected by individual mutations. Here we address this prediction by investigating whether traits related to mutational pleiotropy, expression level and network connectivity, affect the parameters of the DFE using whole genome data from A. thaliana and C. grandiflora, two closely related Brassicaceae species that vary significantly in their demography and mating system, and therefore, in effective population size and the effects of linked selection. Results were similar across both species. We found that expression level and network connectivity were predictive of the parameters of the deleterious DFE, even once co-correlations among genome biology traits were accounted for. Our results suggest that, across the genome, molecular evolutionary patterns agree with the predictions of FGM, albeit relaxing the assumption of universal pleiotropy, and that variation in mutational pleiotropy among genes is sufficient to have detectable effects on the DFE.

Significance statement: How do the effects of new mutations vary across the genome? If mutations in some genes affect many traits (high mutational pleiotropy), we hypothesise they will be more strongly deleterious, with lower variance in their selective effects. We test this by investigating the distribution of effects of new mutations across genes that vary in features that are related to mutational pleiotropy: expression level, gene network connectivity, and number of associated GO terms. The mean strength and coefficient of variation of selection of new mutations varied across genes with different features in the manner expected by our hypothesis. This demonstrates that important parameters of molecular evolution can vary across the genome with genome architecture.